BigBench Specification V0.1 - BigBench: An Industry Standard Benchmark for Big Data Analytics
Abstract
In this article, we present the specification of BigBench, an end-to-end big data benchmark proposal. BigBench models a retail product supplier. The benchmark proposal covers a data model and a set of big data specific queries. BigBench’s synthetic data generator addresses the variety, velocity and volume aspects of big data workloads. The structured part of the BigBench data model is adopted from the TPC-DS benchmark. In addition, the structured schema is enriched with semistructured and unstructured data components that are common in a retail product supplier environment. This specification contains the full query set as well as the data model.
Similar resources
Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data
Enterprises perceive a huge opportunity in mining information that can be found in big data. New storage systems and processing paradigms are allowing for ever larger data sets to be collected and analyzed. The high demand for data analytics and rapid development in technologies has led to a sizable ecosystem of big data processing systems. However, the lack of established, standardized benchma...
A BigBench Implementation in the Hadoop Ecosystem
BigBench is the first proposal for an end-to-end big data analytics benchmark. It features a rich query set with complex, realistic queries. BigBench was developed based on the decision support benchmark TPC-DS. The first proof-of-concept implementation was built for the Teradata Aster parallel database system, and the queries were formulated in the proprietary SQL-MR query language. To test oth...
Introducing TPCx-HS: The First Industry Standard for Benchmarking Big Data Systems
Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data. Chaitanya Baru, Milind Bhandarkar, Carlo Curino, Manuel Danisch, Michael Frank, Bhaskar Gowda, Hans-Arno Jacobsen, Huang Jie, Dileep Kumar, Raghunath Nambiar, Meikel Poess, Francois Raab, Tilmann Rabl, Nishkam Ravi, Kai Sachs, Sapta...
Star Schema Benchmark (ssb)
Big Data Analytics Benchmark (BigBench). Tags: pdgf, star schema benchmark, ssb, parallel data generation framework, benchmarking, skew. ... relational models, which have for a few years been the most used to support classical data warehousing applications such as the Star Schema Benchmark (SSB). The Star Schema Benchmark (6) is a recently proposed data warehousing benchmark that has been implement...
Characterizing BigBench Queries, Hive, and Spark in Multi-cloud Environments
BigBench is the new standard (TPCx-BB) for benchmarking and testing Big Data systems. The TPCx-BB specification describes several business use cases (queries) that require a broad combination of data extraction techniques, including SQL, Map/Reduce (M/R), user code (UDF), and Machine Learning. However, currently, there is no widespread knowledge of the different resource require...
Publication date: 2012